Solving the resource constrained project scheduling problem with quantum annealing

Quantum annealing emerges as a promising approach for tackling complex scheduling problems such as the resource-constrained project scheduling problem (RCPSP). This study represents the first application of quantum annealing to solve the RCPSP, analyzing 12 well-known mixed integer linear programming (MILP) formulations and converting the most qubit-efficient one into a quadratic unconstrained binary optimization (QUBO) model. We then solve this model using the D-wave advantage 6.3 quantum annealer, comparing its performance against classical computer solvers. Our results indicate significant potential, particularly for small to medium-sized instances. Further, we introduce time-to-target and Atos Q-score metrics to evaluate the effectiveness of quantum annealing and reverse quantum annealing. The paper also explores advanced quantum optimization techniques, such as customized anneal schedules, enhancing our understanding and application of quantum computing in operations research.

This paper provides an in-depth exploration of quantum annealing's theoretical and practical aspects, specifically in relation to the RCPSP.Initially, we examine the fundamentals of quantum annealing, focusing on the problem-embedding process into the annealer's qubit graph.We then explain the RCPSP and detail the methods used in this study, including the selection of instance generators from existing literature, evaluation metrics for quantum heuristics and strategies for utilizing a quantum annealer for RCPSP (annealing time, shots, reverse schedule).The results section presents the QUBO model for solving the RCPSP.Finally, we present an extensive computational experiment to assess the efficacy of quantum annealing when solving the RCPSP and discuss key factors such as solution sampling, annealing duration, chain strength, and the use of advanced controls like reverse annealing.In addition to our conclusions, we also present several promising open research questions.This algorithm orchestrates a gradual transition from an initial ground state in H 0 to a state described by the problem Hamiltonian H 1 .The H 1 Hamiltonian mirrors the energy function of the optimization problem, ensuring that the ground state for H 1 corresponds to a minimum-cost solution to the optimization problem P .Introduced by Farhi et al. 27 , QA and in more general the adiabatic quantum model of computation demonstrates that if the transition is executed slowly enough, the algorithm will, with high probability, converge to a ground state, i.e., an optimal solution.D-Wave quantum annealing processors are purposefully engineered for the specific task of identifying minimum-cost solutions to the Ising Minimization problem or, indirectly, to the QUBO-an isomorphic problem.The Ising problem, defined on a graph G = (V , E) , entails the assignment of values from {−1, +1} to spin variables s i with the objective of minimizing the following energy function H 1 : Where h = h i : i ∈ V represent weights, and J = J ij : (i, j) ∈ E is a set of coupling constants.In the physical context, spin variables s i can be seen as magnetic poles, with negative J ij indicating ferromagnetic interactions, and positive values suggesting antiferromagnetic interactions.The optimal configuration of spin variables s i that minimizes the energy function is denoted as a ground state, while alternative configurations are classified as excited states that do not have minimum energy.Upon broadening the computational scope, Ising problems are effortlessly transformed into QUBO problems via s i = 2x i − 1 .This modification involves associating binary decision variables x i ∈ {0, 1} with spin variables s i ∈ {−1, +1}.
A D-Wave Quantum Processor Unit (QPU) maintained at a few millikelvin exhibits quantum properties such as superposition and quantum tunneling.Despite the presence of a Faraday shield, the QPU remains susceptible to interference, which generally reduces the likelihood of attaining a ground state.Consequently, we categorize any D-Wave processor as a heuristic solver, that requires empirical methods for performance analysis.The current most advanced D-Wave processor (Advantage 6.3) has more than 5000 active qubits, a connectivity of 15 qubits (qubits are not fully connected), and 35000 active couplers, made of microscopic loops of niobium.These couplers are connected to a sophisticated analog control system through a network of Josephson junctions [28][29][30] .Table 1 lists the properties of the different topologies made available by D-Wave: Chimera, Pegasus, and Zephyr.
Besides noise and interference, another major challenge is qubit connectivity.In order to solve any problem on a quantum annealer, the problem graph (QUBO/Ising) must be mapped into the physical hardware, which has a limited number of qubits and interconnections.Consequently, it may be necessary to alter the structure of the QUBO/Ising problem to fit the constrained topology of the quantum annealer 31 .This involves mapping the model onto a larger qubit graph, a critical step that stresses the importance of the machine's inherent topology.This non-trivial process is known as "minor embedding".When the QUBO/Ising problem's graph requires couplings among qubits not directly connected in a QPU topology, additional qubits are utilized to form connected sub-graphs representing the target graph.These additional qubits are referred to as "logical qubits".For example, consider the QUBO/Ising graph K 3 shown in Fig. 2a, where qubits x 1 and x 2 need to be coupled with x 3 but lack direct physical connections in the QPU's topology, as despicted by the dotted edges between these nodes.Figure 2b illustrates the minor embedding of K 3 using a new logical qubit.To establish this logical link, the adjusted QUBO/Ising problem includes an additional penalty term, either ρ(z 4 − z 3 ) 2 or ρ(z 4 z 3 ) , where ρ denotes the "chain strength parameter" which helps maintain the integrity of logical qubits during computation.
The new logical links established by the minor embedding process are termed "chains".For the solution to be consistent, it is critical that all physical qubits within a chain attain the same value upon measurement; failure to do so indicates a broken chain.The "chain strength parameter" ρ manages a crucial trade-off: setting it too low risks frequent chain breaks, whereas a value that is set too high may restrict the qubits capacity to transition between states.
Sub-optimal embeddings can lead to unnecessarily long chains, as depicted in Fig. 2c.This embedding requires the use of two additional logical qubits, along with the inclusion of extra penalty terms ρ(y 4 y 3 ) and ρ(y 4 y 5 ) , thereby increasing the likelihood of chain breaks.The presence of multiple broken chains often indicates that the solutions yielded by the annealing process may not accurately represent the original problem.As demonstrated by Marshal et al. 32 , poor embeddings, particularly those characterized by long chains, can complicate the solution sampling process.Although various post-processing techniques exist to mend broken chains, such as majority voting or Monte Carlo resampling, this work exclusively considers unbroken chains.Samples with broken chains were discarded, impacting only runtime performance and not the number of valid solutions, regardless of whether discarded samples were feasible or not.
A wide range of heuristics for finding embeddings are available in the literature.Moreover, the quest for better embedding algorithms remains an active area of research [33][34][35] .Notably, Bernal et al. 36 have demonstrated that more effective embeddings can be achieved through integer programming and decomposition methods.In this study, minor embeddings were identified using D-Wave's available heuristic 37 .Figure 3 illustrates an example of the embedding process using the "minor-miner" heuristic on the "Pegasus" topology.
Once the minor embedding process is successfully completed, the annealing process is initiated.While the adiabatic theorem suggests a potentially lengthy annealing duration to ensure the system stays in the ground state, practical constraints arise due to noise, which can elevate the system to (unwanted) higher energy states.Consequently, a brief annealing is conducted and repeated multiple times in a stochastic process that samples from the energy distribution of the problem.Determining the optimal annealing time and the number of samples depends on the specific problem and its energy distribution.Typical values fall within the range of 10 to 100 microseconds for annealing times and 1 to 10000 for the number of samples.
While the traditional QA starts from the ground state of the "Ising Transverse field" Hamiltonian H 0 , and therefore from an initial uniform superposition and evolves towards the target problem Hamiltonian H 1 , a more sophisticated evolution approach can be applied.In Reverse Annealing (RA), the process departs from an already available solution.RA reintroduces partial segments of H 0 by operating in reverse, thereby reinstating a partial quantum superposition.This unique reverse progression is employed to iteratively refine the initial solution [38][39][40][41] .

Resource-constrained project scheduling problem
The RCPSP is one of the most extensively studied scheduling problems and perhaps one of the easiest to describe; however despite this apparent simplicity, the RCPSP conceals its true complexity, as demonstrated by Blazewicz et al. 42 , who show that the RCPSP is NP-hard.This inherent complexity renders the RCPSP as one of the most intractable combinatorial optimization problems.Similar to the JSSP 43,44 the RCPSP falls into the category of problems classified as NP-hard "in the strong sense" 45 .
In essence, the RCPSP considers the scheduling of a single project comprising n non-dummy activities, subject to precedence and resource constraints, with the overarching objective of minimizing the project makespan-the total time required for the completion of all activities.The RCPSP is typically represented by a graph G(A, E) with each node in A = {0, 1, ..., n + 1} that corresponds to the different project activities and each edge in (i, j) ∈ E equivalent to a straightforward finish-to-start precedence relationship, meaning that the beginning of a successor activity j must await the completion of its predecessor activity i .Nodes 0 and n + 1 serve as sym- bolic milestones, representing the "project start" and "project finish", respectively.These milestone activities are often referred to "dummy activities".Each activity j ∈ A has a duration p j and resource consumption b jk , where k belongs to a set of renewable resources R .Each resource k has a maximum capacity B k .A feasible solution to the RCPSP corresponds to a project schedule S = {S 0 , S 1 , . . ., S n+1 } comprised of start times S j (for each activ- ity j ∈ A ) that respect both precedence and resources constraints.An inherent characteristic of the RCPSP is the non-preemptive nature of activities, indicating that once an activity commences, it cannot be interrupted.
Figure 4a depicts the graphical representation of a small instance of RCPSP (known as a network diagram) composed of three activities (plus two "dummy" activities) and two renewable resources.The activity duration, denoted as p j , and resource consumption b jk are presented above and below each node j, respectively.Both renewable resources have a maximum capacity of three units.Figure 4b shows the optimal schedule S = {S 0 = 0, S 1 = 0, S 2 = 1, S 3 = 1, S 4 = 3} for this instance.

Mixed integer linear programming formulations
Given the intrinsic significance of the RCPSP, it is unsurprising that the literature dedicated to Mixed Integer Linear Programing (MILP) formulations for the RCPSP is both prolific and dynamic, offering a myriad of approaches.For a given optimization problem, multiple formulations can be devised.These can be distinguished in three ways: first by the formulation size, notably concerning the number of variables; secondly, by the number of constraints they involve; and thirdly, by the strength of their Linear Programming (LP) relaxation.Typically, for a given problem, a discernible correlation exists between the problem size and the quality of the LP relaxation.Traditionally, enhancing the quality of the LP relaxation requires a new and extended formulation that introduces additional variables and constraints, thus increasing the problem size.With remarkable advancements in LP algorithms, which can now efficiently solve instances involving millions of variables [46][47][48] , it comes as no surprise that, in the context of MILP formulations for the RCPSP, a significant emphasis has been placed on advancing and refining the quality of the LP relaxations 49,50 .
Broadly classified, formulations for the RCPSP fall into three distinct families, as illustrated in Fig. 5.The first category encompasses time-index formulations [51][52][53][54][55][56][57]    by Koné and Artigues 61,62 reveals that sequence-based and event-based formulations tend to be more compact, requiring fewer variables, while time-index formulations, although larger in scale, offer superior Linear Programming (LP) relaxations.It is noteworthy that the majority of time-index formulations tend to be binary and utilize exclusively binary variables, while event-based and sequence-based formulations employ a mix between integer and binary variables.For more general information regarding the different MILP formulations of the RCPSP, please refer to Artigues et al 2013 63 .
In the context of employing a quantum annealer for solving the RCPSP, a pertinent question emerges: Which formulation should one choose among the available options?Specifically, which formulation aligns most effectively with the capabilities of current commercially accessible quantum annealers?Addressing these inquiries necessitates an investigation into the transformation of these formulations into their corresponding QUBO form.This evaluation involves assessing each formulation based on factors such as the resulting QUBO size, the need for additional slack-supplementary variables, and the sparsity of the associated QUBO graph.These characteristics play a pivotal role in determining the suitability of each formulation for solving the RCPSP, particularly within the confines of NISQ-era quantum annealers.This study aims to provide comprehensive insights to answer these critical questions.

Methods
In the following section, we explain different aspects of the experimental methodology followed in this study.

RCPSP instance selection protocol
A classical study for assessing the performance of solving the RCPSP would normally involve instances from the well-known PSPLIB dataset of Kolisch et al. 64,65 or alternatively, the more recent CV dataset of Coelho and  www.nature.com/scientificreports/Vanhoucke 66 .Both datasets offer a large number of hard instances of varying sizes (where size is defined by the number of project activities, ranging from 20 to 120 activities).However, considering the limitations on the number of available qubits in the D-wave Advantage 6.3 Quantum Annealer, we decided to utilize the RanGen instance generator proposed by Demeulemeester, Vanhoucke, and Herroelen 67,68 .This generator allows the creation of random instances with varying levels of difficulty.
The instances used on this work were generated following the well-established protocol proposed by Baptiste and Le Pape 69 , which involves constructing both disjunctive and cumulative instances.Disjunctive instances are characterized by a large number of precedence constraints, resulting in a highly sequential schedule with limited opportunities for parallel execution.On the other hand, cumulative instances exhibit fewer precedence constraints and provide ample opportunities for parallelism and posing a greater challenge for solving.
The RanGen generator facilitates the generation of both disjunctive and cumulative instances using the "Order Strength" (OS) parameter, ranging from 0 to 1.A value of 0 indicates a fully cumulative instance, illustrated in Fig. 6a, while 1 represents a fully disjunctive instance, illustrated in Fig. 6b.The generator also allows tuning the resource constraints by specifying the number of resources, in addition to the level of constraint for renewable resources.This is done via a parameter called "Resource Constrainedness" (RC) that again ranges from 0 to 1, where 1 means that all activities consume the maximum resource capacity B k , while 0 means that there is no resource consumption by the project activities.
In our study, we utilized RanGen to produce instances of the RCPSP across various dimensions determined by the number of non-dummy activities (i.e., n ∈ {3, 4, 5, 6, 7, 8} ).Within each size category, we generated three distinct instance types characterized by OS values of 0.1, 0.5, and 0.9.We refer to these instance types as "Cumulative", "Medium OS", and "Disjunctive", respectively.Each instance features two renewable resources (k = 2) with  www.nature.com/scientificreports/ a RC set to 0.5.The activity duration p j and resource consumption values b jk were constrained within the range of 1 to 2 units to further limit the number of qubits required by the quantum annealer.Furthermore, for every instance, the maximum resource capacity B k was established at 3 units.

Benchmarking metrics for quantum optimization
Benchmarking metrics play a pivotal role in the evaluation of adiabatic evolution computing algorithms, particularly for quantum annealing.In the context of benchmarking, there are inherent challenges that arise from the unique nature of quantum annealers.First, QA blends quantum and analog elements that are devoid of discrete instructions or basic operations that lend themselves to conventional counting methods used by classical computers.Consequently, given the transient and unstable nature of QA, relying on runtime as an evaluation metric becomes a pragmatic approach.Specially when considering that runtime is widely recognized as one of the most critical metrics for assessing algorithm performance.
A further complication arises from the juxtaposition of hardware-implemented quantum annealing algorithms against their software-implemented counterparts (simulated quantum annealing).Traditional benchmarks for computer platforms, software, and algorithms often fail to account for this mixed scenario, leading to a deficiency in standard guidelines for robust benchmarking.
Unlike classical benchmarks that often distill performance into a single metric, the evaluation of adiabatic evolution algorithms, particularly in the quantum realm, necessitates a more nuanced approach.The performance of a heuristic on a given input is aptly described by a curve delineating the trade-off between computation time and solution quality.This nuanced perspective requires a repertoire of metrics for comprehensive evaluation.
In the empirical evaluation of quantum annealing and adiabatic evolution algorithms, several performance metrics have surfaced in the literature.Notable among these is "Time-to-Solution" 70 (TTS), which centers on the total time required for a solver to identify a ground state (optimal solution) with a sufficiently high probability.One of the biggest disadvantages of TTS, is that it relies on a priori knowledge of the optimal solution, potentially overlooking benefits derived from near-ground state solutions.
Alternatively, "Time-to-Target" (TTT) 71 is a relevant metric that focuses on the overall time required by solvers to attain a target solution energy, which is determined by the energy distribution of the quantum annealing processor.This metric provides a more versatile evaluation and acknowledges solutions that may not align with a predetermined optimal outcome.
Another recent interesting metric is the "Q-score" 72 conceived by the technology consulting company Atos.The Q-score measures the maximum number of qubits effectively employed to solve combinatorial optimization problem.Originally, the Q-score was developed to evaluate the solution of the Max-Cut problem using the Quantum Approximate Optimization Algorithm 73 (QAOA); however, this metric can be readily adapted to assess the performance of quantum annealing, demonstrated by Van der Schoot et al. 67 .The Q-score of a given problem can be then calculated using: With β * = 0.2 (obtained empirically by studying the behavior of QAOA).and β(n) determined by the following ratio: Where C(n) is the average energy output obtained from QA, C r (n) is the average output value obtained from solving the QUBO/Ising problem with a random sampling method, and C max (n) equal to the energy of the ground state.
For achieving specific goals of this paper, metrics of interest include TTT and the Q-score.By concentrating on these, we aim to ascertain the potential for speed-ups and determine the largest instance of the RCPSP, measured in terms of the number of project activities, that can be effectively addressed using the current Advantage 6.3 Quantum Annealing system from D-Wave.

QUBO penalty selection strategy
Adjusting the values of the multipliers by weighing the penalties due to relaxed constraints is an important but difficult step.Given a QUBO problem where E(x) = x T Qx , with a number of n binary variables x i , ∀i ∈ X , we can further divide it into two portions: E(x) = v(x) + c(x) , where v(x) corresponds to the energy contribution to the objective function, c(x) corresponds to the energy contribution due to the problem constraints, and represents the penalty weight.The main goal this becomes finding the value of such that the optimal solution to the penalised objective function is the optimal solution of the original constrained problem.Multiple strategies have been proposed in the literature, each offering a unique approach to determining the most effective penalty coefficients.
One of the earliest and simplest strategies, proposed by Lucas in 2014 74 , involves utilizing the upper bound of the pure objective function.This is mathematically represented as = x T Qx , where x i = 1, ∀i ∈ X (i.e., the solution where all binary decision values are set to 1 is used as an upper bound of the energy minimization problem).This approach provides a straightforward and easily computable penalty.
Lucas also introduced another possibility for penalty selection which involves using the maximum QUBO value; this is denoted as: (1) Vol www.nature.com/scientificreports/This method takes into account the highest interaction value between the variables, ensuring that the penalty is significant enough to enforce the constraints effectively.More recently, Verma and Lewis 75 proposed a more sophisticated method in 2020.Their approach estimates the potential gain or loss in the objective function value that could result from switching a particular bit on or off.This method provides a more nuanced and dynamic way of calculating penalties, potentially leading to more accurate and efficient solutions, especially in complex scenarios where the impact of each binary variable on the objective function is not uniform.
Therefore the array of strategies discussed underscores the dynamic and continuously evolving landscape of penalty selection in QUBO problems.Each approach, distinct in its methodology, has its own set of strengths and is tailored to suit different problem types.As penalty selection stands as a vibrant and ongoing field of research, our study opts for a more foundational approach.We employ a simple penalty selection method where the penalties are just a multiple of the sum of activity duration's p i , = n j=1 p i , leveraging the upper bound strategy, to ensure clarity and ease of implementation during our analysis.

Experimental setup
The experimental results presented in this study were derived from the utilization of the D-Wave Advantage 6.3 Quantum Annealer, featuring 5640 qubits and a connectivity of 15 connections per qubit.The minor embeddings were established employing D-Wave's "minor-miner" heuristic.Each instance generated using the approach explained in "Experimental results" section was solved on the QPU with a specific annealing time of 20µs (the decision to use 20µs is based on the analysis conducted in "Anneal time and pausing effects" section and is also shown in Fig. 18).Furthermore, for every instance, 10000 samples were recorded.It is important to note that the mentioned annealing time exclusively pertains to quantum annealing and does not extend to Reverse Quantum Annealing (RQA), since the latter uses an specific annealing schedule described next.The impact of including pauses in the annealing schedule on the performance of QA was assessed in "Anneal time and pausing effects" section.An annealing time of 20µ s was employed, incorporating pauses of 2, 4, 6, 8, 12, 16, and 18µ s, respectively.The resulting annealing schedules are visualized in Fig. 8.This analysis was comprehensive, and covered all instance types and sizes.The evaluation involved recording the relative difference between the energy of the ground state E 0 and the minimum energy value achieved by QA, denoted as E min .This difference is calculated as . For each instance, 1000 samples were recorded.A similar analysis was conducted to evaluate the effect of different annealing times 1, 5, 10, 20, 50, and 100µs in the performance of QA. (3)

QUBO analysis of MILP formulations for the RCPSP
In this section, we identify the optimal QUBO formulation for the RCPSP, focusing on the formulation that requires the fewest qubits.The minimal number of qubits is directly related to the number of slack variables, the type of variables, and the sparsity of the QUBO graph.Initially, we examine these crucial elements across three types of RCPSP formulations.Subsequently, from the eight formulations studied within the selected type, we determine the most effective one.

Analysis of formulation types
Once transformed into QUBO formulations, time-index, sequence, and event formulations (see Fig. 5) are compared with the data presented in Table C2.The number of qubits required for each formulation is reported in Fig. 9, which shows that the time-index formulation PRI69 requires significantly fewer qubits compared to the sequence-and event-based formulations.Moreover, the gap between these formulations increases with the the instance size.These results may appear counter-intuitive when considering the original number of variables required by each formulation, as detailed in Table C2.Notably, sequence-and event-based formulations exhibit a lower number of original variables than time-index formulations; the compactness characteristic of event-based formulations has been documented well by Koné 62 .
The apparent discrepancy in the number of QUBO variables can be elucidated by closely inspecting the nature of the different formulations.The TAM93 formulation requires a priori knowledge of "forbidden sets" F .These are sets of activities that share no precedence constraints and, when scheduled parallely, violate resource constraints.Demeulemeester and Herroelen 2006 76   ) .Consequently, it logically demands more qubits compared to PRI69.This is particularly the case when instance size increases, since the size of the forbidden sets F increases accordingly.
In the ART03 resource flow formulation, instead of using forbidden sets F , integer variables φ k ij (where i, j ∈ A 2 and k ∈ R ) are employed, each requiring ⌊log 2 φ k ij ⌋ + 1 binary qubits for transformation.These variables take part of multiple inequalities for maintaining resource constraints, leading to the need for additional slack integer variables, which also require binary transformation.Similarly, event-based formulations KON11 and ART13 utilize integer variables t e to represent event start times, with KON11 also incorporating extra integers b ek for resource consumption.These variables, also used in several constraints, require further slack variables for binary conversion.

Analysis of time-index formulations
Taking the same RCPSP instances as before, Table C3 presents the outcome resulting from the conversion of eight distinct RCPSP time-index MILP formulations into QUBO.
The number of qubits per formulation is represented by Fig. 10, which shows the efficiency of the PRI69 formulation in comparison to other time-index formulations.This outcome is unsurprising given the inherent simplicity of PRI69 that is characterized by fewer variables, a diminished count of inequalities, and consequently, a reduced number of slack variables.The next favorable option is the MIN98 formulation; however, this choice entails the prerequisite knowledge of feasible sets I .Similar to TAM93, the determination of these sets I entails a worst-case complexity of O(2 n ).
For the other formulations, the various strategies aimed at optimizing solutions for a classical computer prove to be costly in terms of the number of qubits required.We can highlight a few of them here: the CHR87 www.nature.com/scientificreports/formulation modifies the PRI69 formulation by replacing its precedence constraint with additional inequalities; SOU97 introduces "step" variables to signal the start of an activity, while KLE98 considers a "step" formulation based on the MILP of a preemptive RCPSP 56 ; The KLE00_1 and KLE00_2 formulations incorporate "on/off " variables; Lastly, BIA13 integrates continuous binary variables indicating the percentage of activity completion, while MIN98, bearing similarities to TAM93, requires prior knowledge of "feasible sets" I.

RCPSP QUBO
Pritsker et al. 51 proposed PRI69 one of the first formulations for the RCPSP.Although it may initially seem counterintuitive, we have chosen this formulation as the basis for constructing our QUBO based on the analysis in the previous section.We use the notations given in Table 2 to introduce the parameters.
A single type of binary decision variable is considered, denoted as x it , with i ∈ A ∪ {0, n + 1} and t ∈ H .This variable is indexed by both the activities and the associated time.Each element x it , ∀i ∈ A ∪ {0, n + 1} and ∀t ∈ H , takes a {0; 1} value such that: The initial objective function of 51 minimizes the starting time of the dummy-end activity.This function is noted f(x) and its expression is given by ( 6).This is the basis of our QUBO, which will then be completed with penalties corresponding to the relaxations of the constraints.
Moreover, we also consider the work of 11,12 on the JSSP, especially for the precedence constraints and the onestart constraint reformulation.For the latter, we force each activity to start exactly once with the following set of constraints (7).
We relax these constraints for all activities and turn them into one penalty P1 denoted by the expression (8).To ensure that the search for an optimal solution penalizes any infeasible solution, the expression is squared so that satisfying the constraint gives no penalty to the QUBO (i.e., remains 0), and that violating the constraint increases the objective value (which we try to minimize).
We can model the precedence constraints of two consecutive activities using their inequalities (9).
The reformulation of all precedence constraints into a single penalty, denoted as P2 , results in (10).Here, the need to square the decision variables does not arise since any quadratic expression involving two binary variables never yields a negative value.It is easy to understand that the only scenario where such a constraint is not satisfied occurs when two consecutive activities erroneously start at the same time.
x it = 1 if activity i starts at period t, 0 otherwise.www.nature.com/scientificreports/ The JSSP has machine-sharing constraints which can also be modeled by simple quadratic expressions, thus giving a penalty similar to P2.The RCPSP has another difficulty: resource constraints.The inequalities presented in (11) express these constraints, as modeled in 51 .
Since we need a penalty for relaxing the resource constraints, we add slack variables in order to reach an equality for each inequality of (11).Slack variables are noted z tk , t ∈ H, k ∈ R , as a non-negative integer for reformulating resource constraints (11) with the quadratic penalties (12).
Since this reformulation aims to create a QUBO, the slack variables s tk must correspond to binary variables.Here, we can consider the minimum value of z tk as zero and the maximum value as B k .The related binary expression is given by equation ( 13), where y is a binary vector (each y i takes value in {0,1}) and where the integer function f (α) gives the required maximum power of 2 with α as the target integer.

Experimental results
Figure 11 illustrates the TTT evolution for a range of optimization methods applied to a specific cumulative instance involving six non-dummy activities and two resources.Accompanying this, Fig. 12 presents the network diagram relevant to this instance.The figures are organized into a tripartite panel, demonstrating the progression of the optimization process across different energy quantiles.This progression is depicted from left to right, indicating a transition from higher to lower energy levels.
It is noteworthy that QA and RQA exhibit discernible patterns, contrasting with the Random Sampling (RS) behavior observed in the evolution curve of RS.This observation suggests that QA and RQA are not mere random samplers.
For the specific instance presented in Fig. 11, RQA demonstrates exceptional performance and outperforms all other optimization methods, including the widely used commercial solver GUROBI.This superior performance of RQA is particularly noticeable in the third panel of Fig. 11, where, the distinctive red line marked with " △ " symbols, representing RQA, reaches the ground state energy faster than any other method.Furthermore, Fig. 13 illustrates the optimal schedule derived using RQA.This schedule results in an optimal project make span of three days.
Despite the impressive performance of QA and RQA for the instance depicted in Fig. 11, it is crucial to remember that both QA and RQA are heuristic methods and consequently do not guarantee the finding of ground states or global optimal solutions.This limitation is evident in Fig. 14, which showcases the relative deviation between the minimum energy solution obtained by QA and RQA (i.e., E min ) and the ground state energy (i.e., E 0 ) obtained by Gurobi.A value of E min −E 0 E 0 = 0 in this metric indicates a successful sampling of the ground state.However, as illustrated in Fig. 7, both QA and RQA struggled to find the ground state energy solutions.Additional data in Figures H11 and H12 in Appendix H demonstrate the time-to-solution (TTS) in seconds for QA and RQA respectively.( 10) Table 3 presents succinct results featuring the mean TTT values for solutions at different energy quantiles (0.9, 0.99, and 0.999).To provide further insight, Fig. 15 visually captures the distinctive mean TTT patterns and highlights the performance variations among the evaluated optimization methods for all instance types.Figure D2, D3, and D4 from Appendix D offer further insight by illustrating the individual TTT performance for the different instance types evaluated in this work (cumulative, medium OS, and disjunctive).
The results presented in Table 3 and Fig. D4, D2, and D3 reveal interesting dynamics.It is evident that QA and RQA face challenges in maintaining superiority, especially when tasked with finding ground states in instances of larger sizes.Despite this, in mere fractions of a second, both QA and, notably, RQA demonstrate their capability to provide high-quality solutions approaching the proximity of ground states.
Overall, for the high-energy quantiles, GUROBI exhibits consistent superiority over SA, QA, RQA, and other free solvers.However, this trend does not extend to certain freely available solvers, such as GLPK and CBC.Notably, in scenarios involving lower energy quantiles, both QA and RQA demonstrate significant advantages over these free solvers.This is particularly evident in the left ( q = 0.9 ) and middle ( q = 0.99 ) panels of Fig. 15, where QA and RQA outperform their counterparts in reaching lower energy states more efficiently.
Contrary to the findings reported by Carugno et al. 12 in the context of the JSSP, our results indicate that SA surpasses both QA and RQA in terms of performance across most instance sizes and energy quantiles.In agreement with their observations, however, RQA does tend to show superior performance compared to QA, particularly in lower energy quantiles.
Adapting the Q-score methodology for QA, as proposed by Van der Schoot et al. 67 , we computed the β(n) values for various instance sizes and types examined in our study.Analysis of Fig. 16 reveals that the β(n) values for both QA and RQA consistently exceed the critical threshold of 0.2.However, is important to remind the reader that this threshold was empirically established based on the solution to instances of the the Max-Cut problem using QAOA.Therefore, its validity is based solely on these empirical findings and is without additional theoretical support.We, therefore, conclude that the estimated maximum instance size of the RCPSP that can be solved using a D-Wave quantum annealer peaks at seven non-dummy activities.Although the β(n) values substantially exceed the critical threshold, we observe a rapid decline in solution quality beyond this activity count.This deterioration is evident when examining the mean "chain-break" percentage reported by the D-Wave system.For instances with eight activities or more, the average chain-break percentage escalates to approximately 74%, indicating a significant breakdown among the logical qubits that were utilized to map the original QUBO graph onto the QPU topology.With such a high percentage of chain breaks, the reliability of the annealer in solving the intended optimization problem becomes questionable, despite the higher β(n) values observed.

Anneal time and pausing effects
In the preceding section, our experimental approach was predicated on determining an optimal annealing time for QA.This determination was made empirically, analyzing the performance of QA across a spectrum of annealing times: 1, 5, 10, 20, 50, and 100µs.www.nature.com/scientificreports/ The performance evaluation of QA for each annealing time was determined based on a relative energy difference metric.This metric juxtaposes the minimum energy achieved through QA against the ground state energy for the given instances as E min −E 0 E 0 .Our analysis spanned instance with 3, 4, 5, 6, 7, and 8 non-dummy activities and all instance types (Cumulative, Medium OS, and Disjunctive).We collected 1000 samples for each configuration.
Insights into the mean behavior of QA across these different annealing times are presented in Fig. 17.This line graph illustrates the relationship between the instance size (x-axis) and the relative energy differences (y-axis), offering a visual representation of the performance variations across instance sizes.
Complementing this analysis, Fig. 18 presents a holistic view of the mean behavior across all instance sizes and types.Here, the annealing times are mapped along the x-axis, and the corresponding relative energy differences are charted on the y-axis.In accordance with 71 , this graph reveals that an annealing time of approximately 20µs emerges as optimal.Beyond this threshold, there is a discernible decline in QA performance.
Given these findings, we chose to conduct our experiments with an annealing time fixed at 20µs .This decision was guided by empirical evidence, suggesting that this duration strikes an effective balance in optimizing QA performance.Figures E5, E6, and E7 illustrate the effect of annealing times for Cumulative, Medium OS, and Disjunctive instance types.
In continuation of our exploration into the multiple parameters involved in QA, we extended the investigation to assess the impact of annealing pauses on QA performance.This analysis aligns with the relative energy difference framework established earlier, comparing the minimum energy achieved via QA to the ground state energy for a particular instance.
Our focus centered on a fixed annealing time of 20µs to examine the effects of pauses at intervals of 2, 4, 8, 12, 16, and 18µs (refer to Fig. 8 for a visual representation of the annealing schedules after incorporating these pauses).
This investigation was inspired by the findings of Marshal et al. 77 , who report performance enhancements in QA with the inclusion of annealing pauses.Our results, illustrated in Fig. 19, confirm this observation.Figure 19 presents a heat map that encapsulates the QA performance across various instance sizes (3, 4, 5, 6, 7, and 8 non-dummy activities) and pause durations (shown on the y-axis as the percentage of pause relative to the total annealing time).Each cell in this heat map represents the average relative energy value derived from 1000 samples across all instance types (Cumulative, Medium OS, and Disjunctive), providing a comprehensive view of the performance landscape.Intriguingly, Fig. 20 highlights that annealing pauses constituting 20% to 40% of the overall schedule (equating to 4 µs and 8 µs in this context) yield the most favorable performance.Figure F8, F9, and F10 show the effect of annealing pauses for Cumulative, Medium OS, and Disjunctive instance types, respectively.
It is important to note, however, that the inherent nature of RQA always incorporates a pause in the annealing schedule.Consequently, the insights obtained from this analysis predominantly apply to conventional QA rather than RQA.

Discussion
In this work, we have conducted a comprehensive exploration into the practical implementation of the RCPSP using quantum annealing, specifically by leveraging D-Wave's quantum-computing technology.Our investigation encompasses a range of aspects, from the basics of quantum annealing to advanced techniques like reverse quantum annealing.
To the best of our knowledge, we are the first to address the RCPSP using a quantum annealer, marking a significant contribution to the integration of quantum computing in the field of operations research.Our work underscores the importance of QUBO modeling for solving RCPSP instances on quantum annealing machines.We have conducted a thorough analysis of 12 well-known MILP formulations for RCPSP and converted them to a QUBO format.This includes identifying the most suitable formulation for quantum annealing, specifically the PRI69 formulation, and providing the corresponding QUBO model.
It is worth noting that the QUBO model ( 14) derived in "RCPSP QUBO" section can easily be adapted to other variants of the RCPSP, as in the case of the 1-preemptive resource-constrained project scheduling problem (1_PRCPSP).By adapting the formulation proposed by Ballestín et al. 78 , we obtain the QUBO f (x) 1_PRCPSP QUBO (presented in Appendix G1).The problem does not suffer from the addition of new slack variables.The new constraints can be modeled as a quadratic product that must remain zero in order to be satisfied.
Other versions of the RCPSP can be based on the QUBO introduced in this study without the need to add many constraints.For example, if resources have periods of unavailability, or if their capacity is reduced ( B k becomes B kt ), it is sufficient to treat these periods as distinct activities with fixed variable values, as shown by Hartmann 79 .This approach can also be applied when the use of resources needs to be stopped without a predetermined period (e.g., for maintenance operations).In such cases, these periods of unavailability should be considered as activities without setting fixed variables.In these scenarios, it is important to establish a precedence constraint for each virtual activity to ensure that the identification of the unavailability period is properly included in the optimization process.However, this modification will require the inclusion of additional slack variables associated with the resource constraint.www.nature.com/scientificreports/ In this work, we solve three distinct categories of problem instances, as defined by the protocol of Baptiste and Le Pape, using RCPSP QUBO formulation (14) on a quantum annealer.Strategies such as developing a reverse quantum annealing schedule are employed.The results are then compared with those obtained from a classical solver.For this comparison, we conduct a detailed exploration of multiple metrics available for evaluating the performance of quantum annealing.Ultimately, we adopt the TTT and the Q-score for our evaluation.Additionally, we determine the maximum instance size that can still be solved with a machine equipped with 5000 qubits.Another key discovery from our research underscores the potential benefits of quantum annealing, especially in scenarios with constrained time limits, such as those encountered in online operations research problems.In addition to our primary research focus, we conduct an extensive and meticulous analysis to understand the impact of annealing times and the role of pauses within annealing schedules on the performance of QA.
While our research offers valuable insights, it's important to recognize its limitations.These include using problem instances of limited size, while excluding other promising quantum-computing techniques like QAOA 73

Size
Method q = 0.9 q = 0.95 q = 0.99 q = 0.999 q = 0.9999 due to hardware constraints, as well as the dependence on the "minor-miner" heuristic for the embedding procedure.
In alignment with the work of Venturelli et al. 11 and Carugno et al 12 on the JSSP, we recognize the potential for significant reduction in time horizons for the RCPSP.This reduction can be achieved by pre-establishing upper bounds, derived from heuristics that can be calculated in polynomial time 80 .Such bounds can considerably reduce the number of variables required by time-index formulations like "PRI69".However, in this study, our focus was centered on examining the intrinsic performance of QA and RQA without incorporating these heuristic techniques.We also did not explore the use of broken-chain correction techniques like the ones suggested by Marshal et al. 32 .  .β(n) for all instances with 3, 4, 5, 6, 7, and 8 non-dummy activities for all instance types (Cumulative, Medium OS, and Disjunctive).

Figure 1 .
Figure 1.Schematic overview of the methodology employed in this study.The study starts, by searching the most qubit efficient formulation for QA, then moves towards the search of adequate parameters and embedding, to then finally proceed to solve different instances of the RCPSP via QA.

2 Figure 2 .
Figure 2. Three graphs representing the same logical QUBO/Ising, but using different embeddings.(a) represents the K 3 original QUBO/Ising problem.The dashed of the edges (x1-x 3 ) and (x2-x 3 ) indicate that the corresponding qubits are not physically connected in the QPU.(b)shows a minor embedding of (a) with the use of two logical qubits z3 and z4, that can be recognized by their dotted border.(c) is a sub-optimal embedding of (a) that uses three logical qubits instead of two.
(a) Pegasus topology of the QPU.(b) RCPSP problem mapped into the QPU Pegasus topology.

Figure 3 .
Figure 3. Minor embedding process with the Pegasus topology.
Resource profile for resource 2.

Figure 4 .
Figure 4. RCPSP instance example: (a) RCPSP instance graph G(A, E) with three non-dummy activities and two resources; (b) solution schedule for the instance; (c) resource profile for resource 1, and (d) resource profile for resource 2.

Figure 5 .
Figure 5. RCPSP MILP formulations reported in the literature categorized by their family type.

Figure 6 .
Figure 6.Baptiste and Le Pape instance types: (a) the left graph corresponds to a cumulative RCPSP instance; (b) right graph corresponds to a disjunctive RCPSP instance.

Figure 7 .
Figure 7. Reverse quantum annealing schedule used for the RQA numerical results of this work.

Figure 11 .
Figure11.Evolution of TTT in seconds for a cumulative instance (OS = 0.1) of six non-dummy activities and two resources.The three panels show the evolution for different energy ranges, where energy decreases from left to right.The right-most panel shows the energy range for the ground state of this instance.From the same panel, it can be seen that RQA is able to find the ground state energy equivalent to 1925 with a TTT of 0.2 seconds.

Figure 12 . 3 B 2 = 3 (
Figure 12.Network diagram cumulative instance (OS = 0.1) with six non-dummy activities and two resources.The duration of each activity is displayed above each node, while the resource consumption is displayed below.

Figure 13 .
Figure13.Optimal schedule obtained from RQA for the cumulative instance (OS = 0.1) with six non-dummy activities and two resources, described in Fig.11.This optimal solution was obtained in a TTT of 0.2 seconds.The optimal schedule corresponds to a ground state energy of 1925.Please see table I4 in the appendix I for detailed values of the ground state energies for each instance.

Figure 14 . 0 = 0
Figure 14.Relative energy deviation from the ground state measured as E min −E 0 E 0

Figure 17 .
Figure 17.Mean effect of annealing time on QA performance, measured as the mean relative deviation from the ground state E min −E 0 E 0 for all instance types (Cumulative, Medium OS, and Disjunctive).A value of 0 on the y-axis indicates an energy equal to the ground state energy.Please see table I4 in the appendix I for detailed values of the ground state energies for each instance.

Figure 18 .
Figure 18.The figure illustrates the impact of annealing times on QA performance, depicted across various instance sizes and types as a function of the mean relative deviation from the ground state E min −E 0 E 0 on the y-axis.This relationship is represented by a green line, accompanied by its standard deviation delineated as a light green contour.The x-axis quantifies the annealing time.A value of 0 on the y-axis indicates an energy equal to the ground state energy.Please see table I4 in the appendix I for detailed values of the ground state energies for each instance.

Figure 19 .
Figure 19.Heatmap of the effect of pauses included in the annealing schedule vs the relative mean deviation from the ground state energy E min −E 0 E 0 for all instance types.The y-axis shows the percentage of pause in the annealing schedule, as shown in Fig. 8.A value of 0, represented by dark blue in the heatmap, indicates an energy equal to the ground state.Please see table I4 in the appendix I for detailed values of the ground state energies for each instance.

Figure 20 .
Figure 20.General effect of pauses included in the annealing schedule vs the relative deviation from the ground state energy E min −E 0 E 0 for all instances.The y-axis shows the percentage of pause in the annealing schedule, as shown in Fig. 8.

Table 1 .
Characteristics of available D-wave quantum devices.

Table 2 .
RCPSP parameters used in the PRI69 time-index formulation.
TMaximum number of time periods.A Set of activities.H H = 0, 1, ..., T is the scheduling horizon.E Set of edges representing a precedence relationship between two activities.For instance, if i and j are two activities of A , and if activity i must be finished to start activity j, (i, j) ∈ E. R Set of resources.p i Processing time of the activity i ∈ A. B k Capacity of the resource k, k ∈ R. b ik Activity i consumption of the resource k ∈ R and its capacity B k .Vol.:(0123456789) Scientific Reports | (2024) 14:16784 | https://doi.org/10.1038/s41598-024-67168-6 11.This optimal solution was obtained in a TTT of 0.2 seconds.The optimal schedule corresponds to a ground state energy of 1925.Please see tableI4in the appendix I for detailed values of the ground state energies for each instance.

Table 3 .
TTT (s) mean results for disjunctive, cumulative, and medium OS instances.